N=1:

rank | frequency | n-gram |
---|---|---|
1 | 9005 | -и |
2 | 3209 | -о |
3 | 2303 | -ӣ |
4 | 2230 | -а |
5 | 2129 | -н |

N=2:

rank | frequency | n-gram |
---|---|---|
1 | 1500 | -ои |
2 | 1465 | -ро |
3 | 1184 | -ии |
4 | 1033 | -аи |
5 | 994 | -ни |

N=3:

rank | frequency | n-gram |
---|---|---|
1 | 1238 | -ҳои |
2 | 524 | -они |
3 | 502 | -анд |
4 | 258 | -ати |
5 | 230 | -ори |

N=4:

rank | frequency | n-gram |
---|---|---|
1 | 327 | -аҳои |
2 | 152 | -тҳои |
3 | 116 | -онро |
4 | 103 | -рони |
5 | 102 | -анда |

N=5:

rank | frequency | n-gram |
---|---|---|
1 | 64 | -тарин |
2 | 64 | -дааст |
3 | 62 | -атҳои |
4 | 61 | -андаи |
5 | 60 | -истон |

The tables show the most frequent letter n-grams at the ends of words for N=1…5. The procedure parallels that of 2.2.5 Most frequent word beginnings; the aim here is to detect suffixes rather than prefixes.
For N=3:
SELECT @pos:=(@pos+1), xx.* FROM (SELECT @pos:=0) r, (SELECT COUNT(*) AS cnt, CONCAT("-", RIGHT(word,3)) AS ngram FROM words WHERE w_id>100 GROUP BY RIGHT(word,3) ORDER BY cnt DESC) xx LIMIT 5;
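
The same counting can be sketched outside the database for any N. The following Python snippet is only an illustration, not the procedure used to produce the tables: the function name `top_endings` and the toy word list `sample` are hypothetical, and it assumes the input holds one entry per row of the `words` table (with the `w_id>100` filter already applied).

```python
from collections import Counter


def top_endings(words, n, limit=5):
    """Return (rank, count, n-gram) for the most frequent n-letter word endings."""
    # word[-n:] behaves like SQL RIGHT(word, n): a word shorter than n
    # contributes the whole word. Empty strings are skipped.
    counts = Counter(word[-n:] for word in words if word)
    ranked = counts.most_common(limit)
    return [
        (rank, cnt, "-" + ending)
        for rank, (ending, cnt) in enumerate(ranked, start=1)
    ]


# Hypothetical toy word list, used only to make the sketch runnable.
sample = ["китобҳои", "шаҳрҳои", "хонаҳои", "мардони", "занони", "рафтанд"]

for n in range(1, 6):
    print(n, top_endings(sample, n, limit=3))
```

As in the query, each word is counted once regardless of how often it occurs in the corpus, so the figures are type counts rather than token counts.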